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Abstract — A purely relational account of the true XQuery 
semantics can turn any relational database system into an 
XQuery processor. Compiling nested expressions of the fully 
compositional XQuery language, however, yields odd algebraic 
plan shapes featuring scattered distributions of join operators 
that currently overwhelm commercial SQL query optimizers. 

This work rewrites such plans before submission to the rela- 
tional database back-end. Once cast into the shape of join graphs, 
we have found off-the-shelf relational query optimizers — the B- 
tree indexing subsystem and join tree planner, in particular — 
to cope and even be autonomously capable of "reinventing" 
advanced processing strategies that have originally been devised 
specifically for the XQuery domain, e.g., XPath step reordering, 
axis reversal, and path stitching. Performance assessments pro- 
vide evidence that relational query engines are among the most 
versatile and efficient XQuery processors readily available today. 
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I. Introduction 

SQL query optimizers strive to produce query plans whose 
primary components are join graphs — ^bundles of relations 
interconnected by join predicates — while a secondary, pe- 
ripheral plan tail performs further filtering, grouping, and 
sorting. Plans of this particular type are subject to effective 
optimization strategies that, taking into account the available 
indexes and applicable join methods, derive equivalent join 
trees, ideally with a left-deep profile to enable pipelining. 
For more than 30 years now, relational query processing 
infrastructure has been tuned to excel at the evaluation of plans 
of this shape. 

SQL's rather rigid syntactical block structure facilitates its 
compilation into join graphs. The compilation of truly compo- 
sitional expression-oriented languages like XQuery, however, 
may yield plans of unfamiliar shape [11]. The arbitrary nesting 
of for loops (iteration over ordered item sequences), in 
particular, leads to plans in which join and sort operators as 
well as duplicate elimination occur throughout. Such plans 
overwhelm current commercial SQL query optimizers: the 
numerous occurrences of sort operators block join operator 
movement, effectively separate the plan into fragments, and 
ultimately lead to unacceptable query performance. 

Here, we propose a plan rewriting procedure that derives 
join graphs from plans generated by the XQuery compiler de- 

'This article is an extended version of a paper published in the Proceedings of 
the 25th IEEE International Conference on Data Engineering (ICDE 2009). 



scribed in [11]. The XQuery order and duplicate semantics are 
preserved. The resulting plan may be equivalently expressed 
as a single SELECT-DISTINCT-FRDM-WHERE-ORDERBY block 
to be submitted for execution by an off-the-shelf RDBMS. 
The database system then evaluates this query over a schema- 
oblivious tabular encoding of XML documents to compute the 
encoding of the resulting XML node sequence (which may 
then be serialized to yield the expected XML text). 

In this work we restrict ourselves to the XQuery Core 
fragment, defined by the grammar in Fig. [T] that admits the 
orthogonal nesting of for loops over XML node sequences 
(of type nodeO*), supports the 12 axes of XQuery's full 
axis feature, arbitrary XPath name and kind tests, as well as 
general comparisons in conditional expressions whose else 
clause yields the empty sequence (). As such, the fragment 
is considerably more expressive than the widely considered 
twig queries [5], [6] and can be characterized as XQuery's 
data-bound "workhorse": XQuery uses this fragment to collect, 
filter, and join nodes from participating XML documents. 

Isolating the join graph implied by the input XQuery ex- 
pression lets the relational database query optimizer face a 
problem known inside out despite the source language not 
being SQL: in essence, the join graph isolation process emits a 
bundle of self-joins over the tabular XML document encoding 
connected by conjunctive equality and range predicates. Most 
interestingly, we have found relational query optimizers to be 
autonomously capable of translating these join graphs into 
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join trees that, effectively, (1) perform cost-based shuffling of 
the evaluation order of XPath location steps and predicates, 
(2) exploit XPath axis reversal (e.g., trade ancestor for 
descendant), and (3) break up and stitch complex path 
expressions. In recent years, all of these have been described as 
specific evaluation and optimization techniques in the XPath 
and XQuery domain [5], [14], [16] — ^here, instead, they are 
the automatic result of join tree planning solely based on the 
availability of vanilla B-tree indexes and associated statistics. 
The resulting plans fully exploit the relational database kernel 
infrastructure, effectively turning the RDBMS into an XQuery 
processor that can perfectly cope with large XML instances 
(of size 100 MB and beyond). 

We plugged join graph isolation into Pathfinde^ — a full- 
fledged compiler for the complete XQuery language specifi- 
cation targeting conventional relational database back-ends — 
and observed significant query execution time improvements 
for popular XQuery benchmarks, e.g., XMark or the query 
section of TPoX [15], [19]. 

We start to explore this form of XQuery join graph isolation 
in Section [n] where we review the compiler's algebraic target 
language, tabular XML document encodings, and join-based 
compilation rules for XPath location steps, nested for loops. 



and conditionals. The rewriting procedure of Section III then 
isolates the join graphs buried in the initial compiled plans. 
Cast in terms of an SQL query, IBM DB2 V9's relational query 
processor is able turn these graphs into join trees which, ef- 
fectively, implement a series of otherwise XQuery- and XPath- 
specific optimizations. A further quantitative experimental 
assessment demonstrates that DB2 V9's built-in pureXML^M 
XQuery processor cuiTently faces a serious challenger with its 
relational self if the latter is equipped with the join graph- 
isolating compiler (Section |IV| l. Sections |V] and |VT] conclude 
this paper with reviews of related efforts and work in flux. 

II. Join-Based XQuery Semantics 

To prepare join graph isolation, the compiler translates the 
XQuery fragment of Fig. [T] into intermediate DAG-shaped 
plans over the table algebra of Table [l] This particularly 
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<open_auction id="l"> 
<initial> 
15 

</initial> 
<bidder> 
<time>18 : 43</time> 
<increase> 

4.20 
</increase> 
</bidder> 
</open_auction> 



Fig. 2. Encoding of the infoset of XML document auction. xml. Column 
data carries the nodes' typed decimal values. 



simple algebra dialect has been designed to match the ca- 
pabilities of SQL query engines: operators consume tables 
(not relations) and duplicate row elimination is explicit (in 
terms of S). The row rank operator gia:(bi....,b„) exactly mimics 
SQL:1999's RANKO OVER (ORDERBY bi, . . . , b„) AS a and is 
primarily used to account for XQuery's pervasive sequence 
order notion. The attach operator @a:c(e) abbreviates e x ^, 
where the right-hand side argument denotes a singleton literal 
table. Operator 9 marks the root of the plan DAG, delivering 
those rows that encode the resulting XML node sequence. 
Below we will see how the join operator n assumes a central 
role in the translation of XPath location steps, for loops, and 
conditional expressions. 

A. XML Infoset Encoding 

An encoding of persistent XML infosets is provided via 
the designated table doc. In principle, any schema-oblivious 
node-based encoding of XML nodes that admits the evaluation 
of XPath node tests and axis steps fits the bill (e.g., ORD- 
PATH [17]). The following uses one such row-based format 
in which, for each node v, key column pre holds u's unique 
document order rank to form — together with columns size 
(number of nodes in subtree below v) and level (length of 
path from v to its document root node) — an encoding of the 
XML tree structure (Fig. [2] and [12]). XPath kind and name 
tests access columns kind and name — multiple occurrences of 
value DOC in column kind indicate that table doc hosts several 
trees, distinguishable by their document URIs (in column 
name). For nodes with size ^ 1, table doc supports value- 
based node access in terms of two columns that carry the 
node's untyped string value [8, §3.5.2] and, if applicable, the 
result of a cast to type xs : decima]j^ (columns value and data, 
respectively). This tabular XML infoset representation may be 
efficiently populated (during a single parsing pass over the 
XML document text) and serialized again (via a table scan in 
pre order). 

B. XPath Location Steps 

Further, this encoding has already been shown to ad- 
mit the efficient join-based evaluation of location steps 

'in the interest of space, we omit a discussion of the numerous further 
XML Schema built-in data types. 
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child pre^ < pre ^ pre^, + sizeo A levelo + 1 = level 

descendant pre^ < pre ^ pre^, + sizeo 

ancestor pre < pre^ ^ pre + size 

following pre,, + sizeo < pre 



Fig. 3. Predicates implementing the semantics of XPath kind and name 
tests — expressed in sequence type syntax [4, § 2.5.3] — and axes (exceipt). o 
marks the properties of the context node(s). 



a: -.n along all 12 XPath axes a [12]. While the structural 
node relationship expressed by a maps into a conjunctive 
range join predicate axis (a) over columns pre, size, level, 
the step's kind and/or name test n yields equality predi- 
cates over kind and name (Fig. [3]l. Consider the three-step 
path Qo — doc (" auction. xml") /descendant :: bidder/ 
child: :*/child: :text() over the document of Fig. |2] 
To perform the final child: :text() step, which will have 
context elements time and increase, the database system 
evaluates a join between the document encoding and the step's 
context nodes (the query yields the pre ranks of the two 
resulting text nodes): 



n [a ^ (doc) M 

item:\ feindt(text ) A arris (child) 

pre ^ namer(text CJ ) 



With their ability to perform range scans, regular B-tree 
indexes, built over table doc, perfectly support this style of 
location step evaluation [12]. 

C. A Loop-Lifting XQuery Compiler 

From [11] we adopt a view of the dynamic XQuery seman- 
tics, loop lifting, that revolves around the for loop as the core 
language construct. Any subexpression e is considered to be 
iteratively evaluated inside its innermost enclosing for loop. 
For the XQuery fragment of Fig. [T] each iterated evaluation 
of e yields a (possibly empty) ordered sequence of nodes. 
To reflect this, we compile e into an algebraic plan that 
returns a ternary table with schema iter| pos| item: a row [i, v] 
indicates that, in iteration i, the evaluation of e returned a 
sequence containing a node with pre rank v at sequence 
position p. 

The inference rules |Doc| |Ddo[ |Step| [IfI |Comp| |For| and 
|Var| (taken from [11] and reproduced in Appendix |A]) form 
a compositional algebraic compilation scheme for the XQuery 
dialect of Fig. [T] The rule set expects to see the input query 
after XQuery Core normalization: the enforcement of duplicate 
node removal and document order after XPath location steps 
(via the application of f s:distinct-doc-order(-), abbre- 
viated to fs:ddo( ) in the following) and the computation of 



effective Boolean values in conditionals (via fn:boolean( )) 
is explicit [8, §4.2.1 and §3.4.3]. 

D. The Compositionality Threat 

To obtain an impression of typical plan features, we compile 
doc ("auction. xml") 



/descendant: : open_auction [bidder] . 
After XQuery Core normalization, this query reads 



(Qi) 



for $x in f s :ddo(doc("auction.xml") 

/descendant: : open_auction) 
return if (fn:booleaji(f s:ddo($x/child: :bidder))) 
then $x else () . 



Fig. |4] shows the initial plan for Qi Since the inference 



rules of Fig. 13 implement & fully compositional compilation 
scheme, we can readily identify how the subexpressions of Qi 
contribute to the overall plan (to this end, observe the gray plan 
sections all of which yield tables with columns iter|pos|item). 
XQuery is a functional expression-oriented language in which 
subexpressions are stacked upon each other to form complex 
queries. The tall plan profile with its stacked sections — 
reaching from a single instance of table doc (serving all node 
references) to the serialization point 9 — directly reflects this 
orthogonal nesting of expressions. 

Note, though, how this artifact of both, compositional language 
and compilation scheme, leads to plans whose shapes differ 
considerably from the ideal join graph + plan tail we have 
identified earlier. Instead, join operators occur in sections 
distributed all over the plan. A similar distribution can be 
observed for the blocking operators 5 and g (duplicate elimina- 
tion and row ranking). This is quite unlike the algebraic plans 
produced by SQL SELECT-FROM-WHERE block compilation. 

The omnipresence of blocking operators obstructs join op- 
erator movement and planning and leads industrial-strength 
optimizers, e.g., IBM DB2 UDB V9, to execute the plan in 
stages that read and then again materialize temporary tables. 
In the following we will therefore follow a different route and 
instead reshape the plan into a join graph that becomes subject 
to efficient one-shot execution by the SQL database back- 



end. (Section IV will show that join graph isolation for Q 



improves the evaluation time by a factor of 5.) 

III. XQuery Join Graph Isolation 

In a nutshell, join graph isolation pursues a strategy that 
moves the blocking operators {g and 5) into plan tail positions 
and, at the same time, pushes join operators down into the 
plan. This rewriting process will isolate a plan section, the 
join graph, that is populated with references to the infoset 
encoding table doc, joins, and further pipelineable operators, 
like projection, selection, and column attachment (tt, a, @). 

The ultimate goal is to form a new DAG that may readily 
be translated into a single SELECT-DISTINCT-FROM-WHERE- 
ORDER BY block in which 

(1) the FROM clause lists the required doc instances. 
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(2) the WHERE clause specifies a conjunctive self -join predicate 
over doc, reflecting the semantics of XPath location steps 
and predicates, and 

(3) the SELECT-DISTINCT and ORDER BY clauses represent the 
plan tail. 

A. Plan Property Inference 

We account for the unusual tall shape and substantial 
size (of the order of 100 operators and beyond for typical 
benchmark-type queries) of the initial plan DAGs by a peep- 
/lo/e-style rewriting process. For all operators ®, a property 
inference collects relevant information about the plan vicinity 
of ®. The applicability of a rewriting step may then decided by 
inspection of the properties of a single operator (and its closer 
neighborhood) at a time. Tables [ir[|V] define these properties 
and their inference in an operator-by-operator fashion. We 
rely on auxiliary function cols{-) that can determine the 
columns used in a predicate {e.g., co/s(preo + sizCo < pre) = 
{pre^, sizeo, pre}) as well as the columns in the output table of 
a given plan fragment {e.g., coZs^@;ter:i^^^^^ = {iter, pes}). 
Furthermore we use ^ to denote the reachability relation of 
this DAG {e.g., ® ^ 9 for any operator ® in the plan). 

icols This property records the set of input columns strictly 
required to evaluate ® and its upstream plan. At the plan root 
9, the property is seeded with the set {pos, item}, the two 
columns required to represent and serialize the resulting XML 
node sequence. The icols column set is inferred top-down and 
accumulated whenever the DAG-walking inference enters a 
node more than once. 

const A set with elements of the form a = c, indicating that 
all rows in the table output by ® hold value c in column a. 
Seeded at the plan leaves (instances of doc or literal tables) 
and inferred bottom-up. 

key The set of candidate keys generated by ® (bottom-up), 
set Boolean property set communicates whether the output 
rows of ® will undergo duplicate elimination in the upstream 
plan. Inferred top-down {set is initialized to true for all 
operators but 9). 

B. Isolating Plan Tail and Join Graph 

The isolation process is defined by three subgoals @, @, 
and (S) (described below), attained through a sequence of 
goal-directed applications of the rewrite rules in Fig. |5] Note 
how the rules' premises inspect the inferred plan properties, 
as described above. In addition to these three main goals, 
"house cleaning" is performed by rules defined to simplify 
or remove operators (Rules ([TJ|5|, and ([T3]l). Otherwise, 
the subgoals either strictly move g and 5 towards the plan tail 
or push N down into the plan. Progress and termination thus 
is guaranteed. Goal @ is pursued first. 

@ Establish a single g operator in the plan tail. A single 
instance of the row ranking operator g suffices to correctly 
implement the sequence and document order requirements 
of the overall plan. To this end. Rule ^V2\ trades a g for a 
projection if g ranks over a single column. For the compilation 



for $x in 



if (• • • ) then $x else () 



child: : bidder 




descendan : 



Fig. 4. Initial stacked plan for |Qi| The gray plan sections indicate input 
XQuery subexpressions and applied compilation rules. 



4 



TABLE II 

Top-down inference of property icols for the input(s) of operator ®. 



TABLE III 

BOTTOM-UP INFERENCE OF PROPERTY const FOR OPERATOR ®. 
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TABLE IV 

BOTTOM-UP INFERENCE OF PROPERTY key FOR OPERATOR I 
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TABLE V 

Top-down inference of Boolean property 

set FOR THE INPUT(S) OF OPERATOR ®. 
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of the input(s) of ® 
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e.set <- 6.s6t A set 
e.set <- e.set A set 



Rules |Ddo| and |Step[ which introduce row rankings of this 
form (^?pos:(item))' this effectively means that document order 
determines sequence order — which is indeed the case for 
the result of XPath location steps and fs:ddo( ). All other 
instances of g (£'posj:(5ort,pos)> introduced by Fori are moved 
towards the plan tail via Rules ([l4]i-([T7|i. The premises of 
Rules ( [T4| l and ( [TS] ) are no obstacle here: for the XQuery 
fragment of Fig. [T] the compiler does not emit predicates over 
sequence positions (column pes). Once arrived in the plan tail, 
Rule (TtJi splices the ranking criteria of adjacent g operators. 
Rule (2} finally removes all but the topmost instance of g. 

(s)+@ Establish a single 5 operator in the plan tail + join 
push-down and removal. Duplicate elimination relocation 
and join push-down and removal are intertwined. Fig. |6] 
illustrates the stages of this process (the ( ; represent plan 
sections much like in Fig. [4]). These subgoals target and 
ultimately delete the equi-joins introduced by the compilation 
Rules [Tf| and |For| (the latter is in focus here). 

A join of this type preserves the keys established by #inner 
and thus emits unique rows. The introduction of a new 5 
instance at the top of the plan fragment thus does not alter 
the plan semantics (Rule ([8|, see Figures f a) and (b)i. 
This renders the original instance of 5 obsolete as duplicate 
elimination now occurs upstream (Rule (|6|, Fig. The 
following stages push the join towards the plan base, leaving 



a trail of plan sections that formerly occurred in the join input 
branches (Rule ( [TT| i, Figures ( £)] and |(d)[ ). The condition 
(72 ^ ® in the premise of Rule ( [TT| i prevents its further 
application in the situation of Fig. |^|^d)[ otherwise, the rewrite 
would introduce a cycle in the plan. Instead, Rule (|9]l detects 
that the join has degenerated into a key join over identical 



inputs and thus may be removed (Fig. { e)i. Finally, this 
renders the remaining instance of #inner obsolete (column inner 
not referenced. Rule ([T])). 

C. XQuery in the Guise of SQL SFW-Blocks 

Fig. [Tjdepicts the isolation result for Query Qi (original plan 
shown in Fig. |4]l. The new plan features a bundle of operators 
in which — ^besides instances of tt, a — the only remaining 



joins originate from applications of compilation Rule Step 



implementing the semantics of XPath location steps. The joins 
consume rows from the XML infoset encoding table doc which 
now is the only shared plan node in the DAG. As desired, we 
can also identify the plan tail (in the case of Qi no extra 



row ranking is required since the document order ranks of 
the elements resulting from the descendant : : open_auction 
step — in column pre produced by the topmost tt operator — 
already determine the overall order of the result.) 

Quite unlike the initial plans emitted by the compositional 
compiler, XQuery join graph isolation derives plans that are 
truly indistinguishable from the algebraic plans produced by a 
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a ^ icols a ^ icols a ^ icols {ai, . . . , a„} n ico/,s / 



:(6) 77^ FT r^(7) ^ . ^^TTTTV^ (8) 



cols(q)\{const\icols) (q)) ®{q) ^ S {■^^cols{®{q))) 

{a} e key gi ^ 92 Q2 ^ ?i {a = c, b = c} C const ® ^ {5, #} 92^® {a, b} C co/s(qi) U co/s((j2) 

(9) (lU) 7 r -. r (11) 

91 (/2 ^ 91 qi ^a=h 92 ^ 91 X 92 ®(gi) t^s=b 92 ^ ®(9l ^a^b 92) 

(12) (13) ®eW„S,@,#} a ^ colsip) ^^^^ 

£'a:(b)(9) -* 7ra.b,co;s(9)(9) ^a: (bi , . . . ,b„ > (9) ^ £'a: {bi , . . . ,b„ > \ consi (9) ®(£'a:(bi b„)(9)) ^ £'a:(bi b„)(®(9)) 

®g{Np,x} ai^co;s(p) ^^^^ ^^^^ 

£'a:(bi,...,b„)(9l) ® 92 ^ £'a;<bi,...,b„>(9l ® 92) n^^ci c„ (bi , . . . ,b„ ) (9)) ^ fa: (bi , . . . ,b„ ) (^bi , . . . ,b„ ,ci , . . . ,c„ (9) ) 

-(17) 



£'a:(...,b,,...> (£'bi:(ci,...,c„0(9)) ~* &:(... ,bi_ 1 ,ci ,c„ ,b. + i ) (fib, : (ci , . . . ,c„ > (9)) 

Fig. 5. Join graph isolation transformation (for a rale Ihs rhs, the properties icols, const, set, and key denote the properties of Ihs). 
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Fig. 6. Moving duplicate elimination into the plan tail and join push-down 
(stages shown left to right). 
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regular SQL translator. We thus let an off-the-shelf relational 
database back-end autonomously take over from here. It is 
now reasonable to expect the system to excel at the evaluation 
of the considered XQuery fragment as it will face a familiar 



workload. (This is exactly what we observe in Section IV ) 

Most importantly, the join graphs provide a complete de- 
scription of the input query's true XQuery semantics but do not 
prescribe a particular order of XPath location step or predicate 
evaluation. It is our intention to let the RDBMS decide on 
an evaluation strategy, based on its very own cost model, the 
availability of join algorithms, and supporting index structures. 
As a consequence, it suffices to communicate the join graph 
in form of a standard SQL SELECT-DISTINCT-FROM-WHERE- 
ORDER BY-block — i.e., in a declarative fashion barring any 



XQuery-specific annotation s or si milar clues. For Query Qi_ 
we thus ship the SQL query Q^^^ (Fig. [s]) for execution by the 
database back-end. 

Plan tail. The interaction of for loop iteration and sequence 
order of the final result becomes apparent in the plan tail of 
the following query (traversing XMark data [19] to return the 
names of those auction categories in which expensive items 



Fig. 7. Final plan emitted for |Qi| The separates the plan tail (above) 
from the isolated join bundle (three-fold self-join of table doc). 



were sold at prices beyond $500): 

let $a := doc ("auction. xml") 
for $ca in $a//closed_auction [price > 500], 
$i in $a//item, 

$c in $a//category (.Q2) 
where $ca/itemref /Oitem = $i/@id 

cind $i/incategory/(§category = $c/(§id 
return Sc/nsune 

This example features let bindings and general value com- 
parisons of atomized nodes, two extensions of the language 
fragment of Fig. [T] that have been shown to readily fit into the 
loop-lifting compilation approach [11]. XQuery Core norma- 
lization, compilation and subsequent isolation yields the SQL 
join graph query in Fig. [9] which describes a 12-fold self- 
join over table doc. Note how the ORDER BY and DISTINCT 
clauses — which represent the plan tail — reflect the XQuery 
sequence order and duplicate semantics: 



Order The nesting of the three for loops in Q2 principally 



determines the order of the resulting node sequence: in TJ^ 
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SELECT DISTINCT d2 . * 

FROM doc AS dl , doc AS d2, doc AS d3 
WHERE dl.kind = DOC 

AND dl.name = 'auction.xml' 
AND d2.kind = ELEM 

AND d2.name = ' open_auction' , s 

AND d2 BETWEEN dl.pre + 1 AND dl .pre + dl . size ^^'^ 
AND dS.kind = ELEM 
AND dS.name = 'bidder' 

AND d3 BETWEEN d2.pre + l AND d2 .pre + d2 . size 
AND d2. level + 1 = d3. level 
ORDER BY d2.pre 



TABLE VI 

B-TREE INDEXES PROPOSED BY db2advis. 



Fig. 



SQL encoding of |Qi[s join graph. 



SELECT DISTINCT dl2.*, 

d2.pre AS iteml, d4.pre AS item2, 
dS.pre AS itemS C/nSCILN 
FROM doc AS dl, ... , doc AS dl2 
WHERE . . . 

ORDER BY d2.pre, d4.pre, dS.pre, dl2.pre 



{Ql 



Fig. 9. SQL encoding of|Q2| (focus on plan tail: order, duplicate removal). 



row variables d2, d4, d5 range over closed_auction, 
item, category element nodes, respectively. The docu- 
ment order of the name elements (bound to dl2) is least 
relevant in this example and orders the nodes within each 
iteration of the innermost for loop. 
Duplicates The XPath location step semantics requires du- 
plicate node removal (row variable dl2 appears in the 
DISTINCT clause). Duplicates are retained, however, across 
for loop iterations (keys d2.pre, d4.pre, and dS.pre 
appear in the DISTINCT clause). 

IV. In Laboratory with IBM DB2 V9 (Experiments) 

The SQL language subset used to describe the XQuery 
join graphs — flat self-join chains, simple ordering criteria, 
and no grouping or aggregation — is sufficiently simple to 
let any SQL-capable RDBMS assume the role of a back- 
end for XQuery evaluation. We do not rely on SQL/XML 
functionality, in particular In what follows, we will observe 
how IBM DB2 UDB V9 acts as a runtime for the join-graph- 
isolating compiler. In this context, DB2 V9 appears to be 
especially interesting, because the system 

(1) has the ability to autonomously adapt the design of its 
physical layer, indexes in particular, in response to a given 
workload (this will help to assess whether an RDBMS 
can indeed cope with XQuery specifics, like the pervasive 
sequence and document order notions), and 

(2) features the built-in XQuery processor pureXML''^ which 
implements a "native" XML document storage and specific 
primitives for XPath evaluation, but nevertheless relies on 
the very same database kernel infrastructure. This will pro- 
vide an insightful point of reference for the performance 
assessment of Section HV-B I 

Autonomous index design. The workload produced by the 
join-graph-isolating compiler is completely regular: as long as 



Index key columns 



Index deployment 




<nk^)> ^ik^ <nkip> Aikp) XPath node test and axis step, 
— — — — access document node (doc (•)) 

Atomization, value comparison with 
subsequent/preceding XPath step 
Serialization support (supplies XML 
infoset in document order) 
p:pre, s:pre + size, hievel, k:kind, n:name, v:value, d:data 



the input expressions adhere to the XQuery dialect of Fig. [T| all 
emitted queries will, for example, evaluate predicates against 
ranges with endpoints pre, pre + size and always use column 
pre as ordering criterion which finds perfect support in a 
clustered B-tree on the primary key column pre. (Here, column 
size is exclusively used as a summand in pre + size — we thus 
replaced column size with a computed column that contains 
the sum.) 

Due to this high predictability, we expected the DB2 au- 
tomatic design advisor, db2advis [1], to be able to suggest 
a reasonable, tailored set of vanilla B-tree indexes to support 
the typical XQuery join graph workload. 

To provide the RDBMS with complete information about 
the expected incoming queries, we instructed the compiler to 
make the semantics of the serialization point 9 explicit — this 
adds one extra descendcoit-or-self : :node() step to any 
Query Q, originating in its result node sequence: 

for $x in Q return $x/descendant-or-seIf : :node() 

This produces all XML nodes required to fully serialize the 
result (surfacing as the additional topmost self-join in the join 
plans of Figures [TO] and [TT] ). 

as an representative of the prototypical expected 



For Q 



query workload, the DB2 design advisor suggests the B-tree 



index set of Table VI Configured to exploit an unlimited time 
and index space budget, db2advis proposes indexes that add 
up to a total size of 300 MB for a 110 MB instance of the 
XMark auction.xml document. Due to the regularity of the 
emitted SQL code, the utility of the proposed indexes will be 
high for any workload that exhibits a significant fraction of 
XQuery join graphs. 

Partitioned B-tree index support for XQuery. The majority 



of the index keys proposed in Table VI are prefixed with 
low cardinality column(s), e.g., n, nk, or nik. An XMark 
XML instance features 77 distinct element tag and attribute 
names, regardless of the document size. Similar observations 
apply to the 7 XML node kinds (column kind) and the typical 
XML document height (0 level < 14 for the XMark 
instances). A B-tree that is primarily organized by such a low 
cardinality column will, in consequence, partition the XML 
infoset encoding into few disjoint node sets. Note how, in 
a sense, a name-prefixed index key leads to a B-tree-based 
implementation of the element tag streams, the principal data 
access path used in the so-called twig join algorithms [5], [6]. 
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TABLE VII 
Relevant IBM DB2 plan operators. 



Operator 

[return] 

[^NLJOIN^ 



[Txscan] 

<nlkps> 



Semantics 

Result row 
delivery 

Nested-loop join 
(left leg: outer) 
B-tree scan 

Index access 



Operator Semantics 



SORT 



[hsjoin] 



[tbscan] 

[doc] 



Sort rows (+ dup. 
row elimination) 
Hash join 
(left leg: probe) 
Temporary table 
scan 

XML infoset 
table access 



Partitioned B-trees enjoy a number of desirable properties [10]. 
In a name-partitioned B-tree with key prefix compression, for 
example, each partition will contain its element name exactly 
once (zero redundancy tag name storage). 

The design advisor further suggests an index with key 
vnlkp whose value column prefix supports atorrdzation and the 
general value comparisons between (attribute) nodes featured 
in Q2 A B-tree of this type bears some close resemblance with 
the XPath-specific indexes (CREATE INDEX . . . GENERATE 
KEY USING XMLPATTERN ... AS SQL VARCHAR(n)) em- 



ployed by pureXML™ (Section IV-B 1 



The proposal of the unique clustered index with key pjnvkls 
is owed to the descendant-or-self : : node () step, intro- 
duced to enable XML serialization. A forward scan of this 
index provides access to the full XML infoset for any subtree 
encoded in table doc, in document order. Since db2advis has 
observed that such scans will be the principal (here: only) 
use of this index, it proposes pre as the only key column — 
all other columns (name through pre + size) are specified in 
DB2's INCLUDE(-) index creation clause [1] and thus merely 
occupy space on the index leaf pages. 

A. XPath Continuations 

How exactly does DB2V9's query optimizer deploy the 
indexes proposed by its design advisor companion? An answer 
to this question can be found through an analysis of the plan 
trees generated by the optimizer. We have, in fact, observed a 
few not immediately obvious "tricks" that have found their 
way into the execution plans. Most of these observations 
are closely related to query evaluation techniques that have 
originally been described as XPath-specific [5], [14], [16], 
outside the relational domain. Since we have transferred all 
responsibility for the XQuery runtime aspects to the RDBMS, 
we think this is quite interesting. 



The optimized DB2 execution plan found for Query Qi of 



Section II-D is shown in Fig. 10 We are reproducing these 
execution plans in a form closely resembling the output of 
DB2's visual explain facility. Nodes in these plans represent 
operators of DB2's variant of physical algebra — all operators 



relevant for the present discussion are introduced in Table VII 



Path stitching and branching. Consider the B-tree index with 
key nksp. Due to its nk prefix, this index piimaiily provides 
support for XPath name and kind tests. In the execution plan 



[return] 
[tbscan]] 
[sort] 
[ ^nljoin] ! 



(nuo\n } [ [ixscan] ] 

^ — r^]ii ^<p|nvkl 

[nuoinJ [ixscan]^ 
[ixscanJ [ixscanJ 

<^nksp^ /nkspi 




^2 ■ : node () 
(serialization) 



,1^: : open_auction 



doc ( " auction . xml ") '/descendant). 



(/d-o-s;^ 
r/chii'd' 



Fig. 10. DB2 V9 execution plan for |Qi| with continuation annotations (d-o-s 
abbreviates descendant-or-self). 



for Qi the index is used to access the requested document 
node (name = 'auction. xml' A kind = DOC). Additionally, 
however, the index delivers the infoset properties sp and thus 
provides all necessary information to step along the XPath 
descendant axis — namely the interval (pre, pre + size], see 
Fig. [3] — from those nodes that have been found during index 
lookup. In the following, we will denote the result of such 
index lookups as 



doc ( "auction . xml" )'ydescendantjj^ 

or, generally nC/a)^ (read: perform the specified node test 
n, then prepare a subsequent step along axis a). In the 



execution plan of Fig. 10 the bottom index nested-loop join 
continues this "half-cooked" step: a lookup in the index 
nkspl retrieves columns nkp to (1) perform the due name 
and kind test (name = ' open_auction' A kind ~ ELEM) 
and (2) complete the structural descendant axis traversal 
(check p for containment in the (pre, pre + size] interval 
obtained in the first half of the step). In the annotated plans, 
we write ,\ : : open_auction (read: resume axis step and 
perform specified name and kind test). Stitched at the matching 
continuation points (here: those with subscript 1), we obtain 
the complete XPath location step again: 

doc("auction.xml")i/descendantj-|^: : open_auction . 

The lookup in index nkspl further provides the necessary 
infoset properties to prepare the now cuiTent continuations 
i/descendant-or-seifjg (columns sp) as well as r/chiidlg 
(columns spl). Such continuations with multiple resumption 
points are the equivalent of the branching nodes discussed in 
the context of holistic twig joins [5]. 

Given the tailored B-tree index set in Table |VI] the DB2 
query optimizer consistently manages to select the index 
access path that provides just the required XML infoset 
properties. Resuming the child continuation at [_)^ : : bidder 
requires columns nk to perform the name and kind test plus 
columns pi to complete the evaluation of the range predicate 



8 



[return] 
[ tbscan ] 
(sort") 
I^nljoinJ 



[ 'nljoin^ [ 'ixscan'3 

I ^NLJOIN^ [ IxSCAInT ]' 
I ^NLJOIN^ [ JXSCAN ] 

[ nljoin^ [ jxscan ] 



[nljoin] [Txscan] 



i|nvkls: 

doc I '^-''ii ' 
nlkp5> ' (serialization) 



before the continuation for resumption point ''^ ji has provided 
any context node information. The index key columns spl 
are used to prepare the continuation (7child;2 which is then 
immediately resumed by the node test for element price. 
Index nkdip is deployed to implement this node test as well as 
node atomization and subsequent value comparison (the index 
key contains column data of table doc). At this point, the plan 
has evaluated the path fragment 



[ ^NLJOII\p [TxSCAN] 



[ ^NLJOIINp [TxSCAN] 




doc I KJia- :name;/d-o-s;j^j 



doc I : category < ; 



(/attribute;. 



;/chlld' 



C jn'- :id/dataC.) 



doc I \ }-,. ■■ category(7data_(_._)_;g 



doc I C )q: : incategory;/ attribute)^ 



(/attribute) 



ynlkp> '-^ |;/child;„ 
^ doc I ( :id/data(.) =( )^ 



I ^NLJOIN^ [JXSCAN] 

r 1 ^.yilkpv 

[ nljoin j [ixscanJ ^ 

[X f ^ \ I doc I ( }o: : item(/data( . ) ; 

[hsjoinJ Qxscan J ^ 

f ^^^ ^ r^^ikT^ 

[ jxscan ] Qxscan] ^" 



itemref (/attributeL 



^ ^ I doc I docC"auction.xnil")(/descendant}, 

c^nkspl^ 

I doc I I doc I ( -P^ic® tdataC . ) > 500] 

( )-^: : closed.auctionj/child;^ 

Fig. 11. DB2V9 execution plan for[Q^ 



that implements the child step (again, see Fig. |3] columns 
pre^, sizGo, levelo are provided by the (/childig continuation). 
The optimizer thus selected the index with key nikp. Finally, as 
anticipated, the plan scans index pjnvkls to traverse all nodes 
in the subtrees below the query's XML result nodes (resuming 
from l/descendant-or-selfj2). 

XPath step reordering and axis reversal. Two further 
phenomena in the execution plans can be explained in terms 
of the XPath continuation notion. For \Qi\ the order of 
the XPath location steps specified in the input query did 
coincide with the join order in the execution plan (access 
document node of XML instance auction. xml, then per- 
form a descendant :: open_auction step, finally evaluate 
child: : bidder). The B-tree index entries provide sufficient 
context information, however, to allow for arbitrary path 
processing orders — right-to-left strategies are conceivable as 
are strategies that start in the middle of a step sequence and 
then work their way towards the path's endpoints. The latter 



can be witnessed in the execution plan of Query Q2 (Fig. 1 1 



: closed_auction!/child;2: : price [data( . ) > 500] 

which is still context-less. The due context is only pro- 
vided by the subsequent NLJOIN-IXSCAN pair which verifies 
that the closed_auction elements found so far indeed are 
descendetnts of auction. xml's document node. Observe 
that in this specific evaluation order of the location steps, 
the closed_auction nodes now assume the context node 
role: the plan effectively determines the closed_auction 
elements that have the document node of auction. xml in 
their cincestor axis. The reversal of axes — in this case, 
trading descendsint for ancestor — is based on the dualities 
pre ^ pre^, size ^ sizeo in the predicates asis (ancestor) 
and aa;is(descendajit), defined in Fig. [5] This observation 
applies to all other pairs of dual axes [16] {e.g., parent/child, 
following/preceding) and, due to the attribute encoding 
used in table doc, also to the attribute/owner relationship 
between element nodes and the attributes they own. 

The query optimizer decides on the reordering of paths and 
the associated reversal of XPath axes based on its "classical" 
selectivity notion and the availability of eligible access paths: 
for a B-tree with name-prefixed keys, the RDBMS's data dis- 
tribution statistics capture tag name distribution while value- 
prefixed keys lead to statistics about the distribution of the 
(untyped) element and attribute values. 

this enabled the optimizer to decide 



In the case of Q 



The plan's very first index scan over nkspl evaluates the 
name and kind test for elements with tag closed_auction 



that the access path nkdIp, directly leading to price nodes 
(key prefix nk) with a typed decimal value of greater than 
500 (key column d), is highly selective (only 9,750 of the 
4.7 million nodes in the UOMB XMark XML instance are 
price elements and only a fraction of these has a typed value 
in the required range). 

An analogous observation about the distribution of untyped 
string values in the value column — the key prefix in the vnlkp 
B-tree — has led the optimizer to evaluate the general attribute 
value comparison 

'\_]^: : id/data ( . ) = 
doc ("auction, xml")/- ■ ■ /attribute : : item/data ( . ) 

before the "hole" [ j^ has been filled. The elements owning the 
Sid attributes are resolved subsequently, effectively reversing 
the attribute axis. (This constellation repeats for the second 
attribute value comparison in Q2 resumption point [ Jg.) 

B. Pure SQL vs. pureXML™ 

With DB2 Version 9, IBM released the built-in pureXMLT"^ 
XQuery processor. This opens up a chance for a particularly 
insightful quantitative assessment of the potential of the purely 
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TABLE Vm 

Sample query set taken from [13] (rightmost column shows the 
query identifier used in [13]). we replaced the non-standard 
return-tuple {Qg) BY AN SQL/XML XMLTABLE CONSTRUCT. 



Query 



Data [13] 



XMarlc 9a 

XMarlc 9c 

DBLP 8c 

DBLP 8g 



Qs /site/people/person [Oid = "personO"] 
/name/text 

Qi //closed_auction/price/text 

Qs /dblp/*[@key = "conf /vldb2001" and 
editor and title] /title 

Qs for $thesis in /dblp/phdthesis 

[year < "1994" and 
author and title] 
return-tuple $thesis/title , 
$thesis/ author , 
$thesis/year 



TABLE IX 

Observed result sizes and wall clock execution times 
(average of 10 runs). 





Query 


DB2 + Pathfinder 


DB2 pureXMLTM 






staclced 


join graph 


whole 


segmented 




# nodes 


(sec) 


(sec) 


(sec) 


(sec) 


Qi 


1,625, 157 


63.011 


11.788 


10.073 


9.661 


Q2 


318 


DNF 


0.544 


DNF 


DNF 


Qs 


1 


60.582 


0.017 


0.891 


0.001 


Qi 


9,750 


32.246 


0.309 


6.455 


7.438 


Q5 


1 


442.745 


0.391 


48.066 


0.001 


Qe 


59 


0.026 


0.004 


1.292 


0.017 



relational approach to XQuery processing discussed here: not 
only can a comparison with pureXML''^ be performed on 
the same machine, but even in the context of a single query 
processing infrastructure (implementation of query optimizer, 
plan operators, and B-tree indexes). 

In a pureXML '"^-enabled database, XML documents are held 
in columns of type XML, containing references to pages 
which store a fragmented tree-structured infoset representa- 
tion. Index-based XQuery processing is based on XMLPATTERN 
indexes whose entries are the typed values — cast to a SQL data 
type like DOUBLE or VARCHAR(n) — of nodes that are selected 
by a given non-branching XPath location path of forward steps 
(along the descendant, child, and attribute axes). A new 
plan operator primitive, XISCAN, can deploy such an index to 
evaluate a general value comparison of an XPath expression 
with a literal value (provided that the indexed path covers 
the queried path [2, see index eligibility]). An index scan 
via XISCAN yields RIDs, the identifiers of those rows that 
contain XIVIL documents with matching nodes. Once these 
documents are retrieved, their node trees need to be traversed 
to navigate to the matching nodes. This actual tree traversal 
is implemented in terms of the new XSCAN primitive. 

Although a B-tree with key vnlkp resembles an XMLPATTERN 
index of type VARCHAR (ri) — both index (the string values 
of) atomized XML nodes — the B-tree is deployed quite dif- 
ferently: a lookup yields document order ranks (column pre) 



which may be used (1) to directly access the infoset properties 
of only the matching nodes via index pjnvkls or (2) to continue 
with XPath location step processing. 

To limit the overhead of the post-index-lookup tree traversal 
performed by XSCAN, pureXML^M favors database designs 
that lead to comparably small XML document segments (of a 
few KB, say) per row. Since the tabular XML infoset encoding 
and XQuery processing strategy discussed here can perfectly 
cope with very large XML instances (beyond 100 MB), for 
the sake of comparison we let pureXML'^ operate over both, 
segmented as well as monolithic XML documents. 

The internals of XSCAN are based on the TurboXPath 
algorithm [13]. This is interesting in its own right: TurboXPath 
supports an XPath 2.0-style dialect quite similar to Fig. [TJand, 
in particular, admits nested for loops and XPath predicates, 
but does not implement the full axis feature (TurboXPath 



supports the vertical axes). Aware of XSCAN's innards, to Q] 
and Q2 we added sample queries that directly stem from 



[13] — these queries are displayed in Table [VIII| The resulting 
query set exhibited runtime characteristics representative for 
the much larger query set we investigated in the course of this 
work. 

ran against an XMark instance of 



The Queries Qi Q. 



110 MB (4,690,648 nodes), [Q^ [Q^ queried an XML represen- 
tation of Michael Ley's DBLP publication database (400 MB 
or 31,788,688 nodes). We accommodated pureXML's prefer- 
ence for many but smaller XML documents and cut the whole 
XMark instance into 23,000 segments of 1-6 KB; the DBLP 
instance was segmented into distinct publications, yielding 
about 1,000,000 segments of 30 nodes (ca. 800 bytes) per 
row. To support the pureXML'''^ processing model, we further 
created an extensive XMLPATTERN index family such that at 
least one index was eligible to support the value references 
occurring in the query set (e.g., for Q3 we created an index 
on /site/categories/ category/@id). 

We then translated the query set with Pathfinder, an XQue- 
ry compiler that includes a faithful implementation of loop 
lifting and join graph isolation described in Sections |II-C| 
and 



111 



Pathfinder was configured to emit the SQL code 
derived from both, the original stacked plan and the iso- 
lated join graph. The resulting SQL queries ran against a 
database populated with tabular XML infoset encodings of the 
XMark and DBLP instances, using a B-tree index setup as 



described in Table VI Both, pureXML'^ and Pathfinder used 
the same DB2UDBV9.1 instance hosted on a dual 3.2 GHz 
Intel Xeon'"^ computer with 8 GB of primary and SCSI-based 
secondary disk memory, running a Linux 2.6 kernel. 
The impact of join graph isolation. Table |IX] summarizes 
the average wall clock execution times we observed. For 
for example, isolating the join graph (Fig. [7]) 



Query Q 



from the initial stacked plan (Fig. [4|l yields a five-fold re- 
duction of execution time. Compositional compilation leads 
to tall stacked plans that exhibit a significant number of 
intermediate g and S operators. Pathfinder translates such 
plans into a SQL common table expression (WITH . . . ) that 
features an equally large number of DISTINCT and RANKO 
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OVER (ORDER BY. . . ) clauses. Given this, DB2 V9 generates 
execution plans with numerous SORT primitives followed by 
temporary table scans. This is different for the join-graph- 
derived plans which compactly encode the XQuery duplicate 
and order semantics in terms of a single SORT operator (see 
Figures 10 and 111. Similar and even more drastic effects 



could be observed for the other queries: join graph isolation 
lets IQ2 [ execute in about 1/2 second (formerly the query did not 
complete within 20 hours), while the execution time of [Qaf - 
IQ5 1 improved by two to three orders of magnitude. Query \Qq\ 
performs an early very selective tag name test (phdthesis) 
and thus showed an improvement of only a factor of six. 



Table IX further assesses how DB2pureXML''''^ fares against 
its Pathfinder-dnven relational self. For Qi the universally 



high execution times of about 10 seconds largely reflect 
the substantial effort to support serialization: the resulting 
node sequence contains 3,249 opeii_auctioii elements, each 
being the root of a subtree containing 500 nodes on aver- 
age. Query Q4 primarily relies on raw path traversal per- 



formance as no value-based index can save the query en- 
gine from visiting a significant part of the XML instance. 
The more than 20-fold advantage of Pathfinder suggests 
that B-tree-supported location step evaluation will remain a 
true challenger for the XSCAN-based implementation inside 
pureXMLT'^. Queries Q5 (and Qq to some extent) yield 



singleton (short) node sequences and constitute the best case 
for the segmented pureXML™ setup: here, XMLPATTERN index 
lookups return a single or few RlD(s), directly leading the 
system to small XML segment(s) — the remaining traversal 
effort for XSCAN then is marginal. For the whole document 
setup, however, an index lookup could only point to the single 
monolithic XML instance: XSCAN thus does all the heavy 



work (the wildcard * in forces the engine to scan the entire 
400 MB DBLP instance). Despite the extensive index options 
available to support Q2 pureXML^M is not able to finish 



evaluation within 20 hours: the system appears to miss the op- 
portunity to perform value-based selections and joins early (re- 



call the discussion of Fig. 1 1 1 and ultimately is overwhelmed 



by the Cartesian product of all closed_auction, item, and 
category elements. The indexes largely remain unused (the 
predicate price [data( . ) > 500] is, in fact, evaluated second 
to last in the execution plan generated by pureXML'"^). 

The sub-second execution times observed for Pathfinder 
indicate that the effort to compile into particularly simply- 
shaped self -join chains pays off. The DB2 V9 built-in monitor 
facility provides further evidence in this respect: the queries 
enjoy a buffer cache hit ratio of more than 90 % since merely 
table doc and indexes fight for page slots. 

V. More Related Work 

One key ingredient in the join graph isolation process 
are the rewrites that move order maintenance and duplicate 
elimination into tail positions. Their importance is underlined 
by similar optimizations proposed by other research groups 
[9], [18], [21]: Fernandez et al. remove order constraints and 



duplicates based on the XQuery Core representation [9] — 
an effect achieved by Rules (|2]i and (|6]l of Figure |5] The 
principal data structure of XQuery Core — item sequences — 
however prohibits merging of multiple orders as in Rule ( [T7] i. 

In [21], a algebra working on ordered tables is the subject of 
order optimization. An order context framework provides min- 
imal ordered semantics by removing — much like Rule (|2| — 
superfluous Sortby operators. In addition, order is merged in 
join operators and pushed through the plan in an Orderby Pull 
up much like in Section|ni] In the presence of order-destroying 
operators such as 5 they however fail to propagate the order 
information to the plan tail (compare to Rule ([T4]i). 

An extension of the tree algebra (TLC-C) in the research 
project Timber introduces order on a global level [18] and 
generates tree algebra plans that — if tree patterns are mapped 
to self-join chains — might be transformed into SQL queries 
similar to join graph isolation. 



In Section IV-A we have seen how a selectivity-based 



reordering of XPath location steps can also lead to a re- 
versal of axes. In effect, the optimizer mimics a family of 
rewrites that has been developed in [16]. These rewrites 
were originally designed to trade reverse XPath axes for their 
forward duals, which can significantly enlarge the class of 
expressions tractable by streaming XPath evaluators. Here, 
instead, we have found the optimizer to exploit the duality 
in both directions — in fact, a descendant axis step has been 
traded for an ancestor step in the execution plan for IQ2I 



(Fig. 111. The evaluation of rooted /descendant : :n steps — 
pervasively introduced in [16] to establish a context node set 
of all elements with tag n in a document — is readily supported 
by the n-prefixed B-tree indexes. Since the XQuery compiler 
implements the full axis feature, it can actually realize a 
significant fraction of the rewrites in [16]. 

Although we exclusively rely on the vanilla B-tree indexes 
that are provided by any RDBMS kernel, cost-based join 
tree planning and join reordering leads to a remarkable plan 
versatility. In the terminology of [14], we have observed the 
optimizer to generate the whole variety of Scan (strict left- 
to-right location path evaluation), Lindex (right-to-left evalu- 
ation), and Bindex plans (hybrid evaluation, originating in a 
context node set established via tag name selection; cf. the 



initial closed_auction node test in Fig. 11 



The path branching and stitching capability (Section IV- 
[A| makes the present XQuery compilation technique a distant 
relative of the larger family of holistic twig join algorithms 
[5], [6], [7]. We share the language dialect of Fig. [T] — coined 
generalized tree pattern queries in [5], [7] — but add to this 
the full axis feature. Quite differently, though, we (1) let the 
RDBMS shoulder 100% of the evaluation-time and parts of 
the compile-time effort invested by these algorithms {e.g., the 
join tree planner implements the findOrder{-) procedure of [7] 
for free), and (2) use built-in B-tree indexes over table-shaped 
data where TwigStack [5] and Twig^Stack [6] rely on special- 
purpose runtime data structures, e.g., chains or hierarchies of 
linked stacks and modified B-trees, which call for significant 
invasive extensions to off-the-shelf database kernels. 
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Finally, this work may be read as one possible response to a 
list of open issues identified in the context of cost-based XQue- 
ry processing with DB2 V9 pureXML'^ (here, we directly refer 
to the specific issues raised in [3, page 316 ff.]): 
XML index exploitation The infoset encoding (Fig. [2]i is truly 

node-based: location paths may originate in any individual 

node and the document order rank (column pre) is sufficient 

to "point into" a document. 



References 



Deferred XPath evaluation The B-tree index keys (Table VI I 
are self-contained: XPath processing may resume without 
any consultation of a document's infoset encoding. 

Cost estimation Since the generated executions plans exclu- 



sively feature the well-known operators (Table Vlli, estab- 
lished procedures for statistics collection and cost estimation 
remain applicable [20] (but see Section [yi]). 
Order optimization Join graph isolation leads to a sound — 
also in the presence of nested for loops — yet compact 
representation of the XQuery order semantics in the plan 
tail, which ultimately finds its way into a single simple SQL 
ORDER BY clause. 

VI. Work in Flux 

This work rests on the maturity and versatility of database 
technology for strictly table-shaped data, resulting from 30+ 
years of experience. We (1) discussed relational encodings of 
the true XQuery semantics that are accessible for today's SQL 
query optimizers, but (2) also saw that some care is needed to 
unlock the potential of a set-oriented query processor 

In the context of the open-source Pathfinder project, we 
continue to pursue the idea of a purely relational XQuery 
processor On the workbench lie DB2's statistical views — 
in our case, pre-formulated descendcint and child location 
steps for which the system records statistical properties but 
not the result itself — which promise to give insight into the 
structural node distribution of an encoded XML document. 
This may further improve join tree planning. 

The scope of this work reaches beyond XQuery. Tall stacked 
plan shapes with scattered distributions of g operators (Fig. |4]) 
also are an artifact of the compilation of complex SQL/OLAP 
queries (in which functions of the RANKO family are perva- 



sive). The observations of Section IV suggest that the rewriting 
procedure of Fig. [5] can benefit commercial query optimizers 
also in this domain. 
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F; loop \- q , ^ 

(DOC) ^.,_„^,„.,.„ ^ „ 7I^\^(DDO) 



F; loop I- doc(Mri) l=> F; loop h fs:ddo(e) l=> £ipos;(item) (5(7riter,item(g))) 

7riter,pos,item;pre (o"(kind=DOC) A(name=uri ) (doc) X @ pos: 1 (loOp) ) 

{. . . ,$Vi 1-^ Qv^, . . .}; loop I- Cif l=> gi/ loop^y = S (7riteri:iter((3i/)) 

{. . . , 1-^ 7riter,pos,item(loOP;f Niteri=iter • ■ ■ } ; loop^f I- etfteji l=> g ©£{ = ,!=,<,<=,>,>=} F; loOp h 6 l=> g 

(IP) T^.i u „^ „.„;^ (COMP) 



{. . . , $Ui Qvi, ■ ■ ■}; loop I- if (fn : boolean (e;/ ) ) thenethen else () l=> g F; loop he© val l=> 

@item:l (©pos:! (5 (TTiter (o-yaluegl^ai (^OC M pre=item (?))))) 

{. . . , gi,; , . . . }; loop h ei„ l=> g^ 

g$a; — #inner(^m) map — TTouteriiter, inner, sortrpos (gSs) 

{■ . ■ , I * 7riter;inner,pos,item (map 1X1 outer— iter ) ? ■ • • } 

{$a; I— > @pos;l (7riter:inner,item(g$i))} ; TTiteriinner (ma p) h e^ct l=> g ,^ , , n 

■(POI^) 7 . . „ T .1 I (VAR) 



{. . . , I— > g„;, . . . }; loop I- for $x in em return erct l=> {. . . , $x i-^ g, . . . } ; loop h q 

7riter:outer,pos:posi,item (£?posi : (sort,pos) {Q ^iter— inner ^3p)) 



F; loop I- e l=> g 



F; loop I- e/a: :n l=> 

t?pos; (item) (writer, item;pre kindt(n) Anamet{n) {^^^) ^ axis (a) ("^iterjpreQ :pre,sizeo :size,levelo :ievel (doC txl ppg— item Q)))) 



(Step) 



Fig. 13. Rules defining the compilation scheme F; loop h e l=> g from XQuery expression e to algebraic plan q. 



Appendix 

A. Inference Rules 

The inference rule set of Fig. 13 (adopted from [11]) imple- 
ments a loop-lifting XQuery compiler for the XQuery subset 
in Fig. Staking into account the XML encoding sketched in 
Section^ The rule set defines a judgment 

F; loop h e 1=^ g , 

indicating that the XQuery expression e compiles into the 
algebraic plan q, given 

(1) r, an environment that maps XQuery variables to their 
algebraic plan equivalent, and 

(2) loop, a table with a single column iter that invariantly 
contains n arbitrary but distinct values if e is evaluated in 
n loop iterations. 

An evaluation of the judgment 0;™ h eo go invokes 
the compiler for the top-level expression eq (the singleton 
loop relation represents the single iteration of a pseudo loop 
wrapped around gq). The inference rules pass F and loop top- 
down and synthesize the plan q^ in a bottom-up fashion. A 
serialization operator at the plan root completes the plan to 
read 9(90)- 
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